This paper introduces CONFIGEN, a tool that helps modularize software. CONFIGEN allows the
developer to select a set of elementary components for their software through an interactive interface.
Configuration files for use by C/assembly code and Makefiles are then generated automatically. We have
successfully used it as a helper tool for refactoring complex system software. CONFIGEN is based
on propositional logic, and its implementation faces hard theoretical problems.

1 Introduction 

A good way to build secure systems is the top-down approach, where each step refines the software 
towards the final implementation. The result is well-integrated, but quite monolithic. Consequently,
further extensions often lead to an overuse of preprocessor conditionals and some code duplication. It is
then important to refactor and modularize the code, with the goal of increasing maintainability and code
reuse. 

We are trying to apply this process to the implementation of the OASIS [9] kernel, an execution
support for hard real-time safety-critical applications. Modularizing this software has specific requirements.
First, the configuration has to be chosen at compile time (in particular, qualification for use in
safety-critical environments requires that no dead code remains in the system). Second, modularity should not
degrade performance, in terms of execution time and memory footprint (for instance,
modularity should not imply new indirections, like C++ virtual method tables). Thus, the tool should allow
the static selection of a subset of the code in order to implement a specific behavior.

CONFIGEN is the tool we built to that end. It is composed of two main parts. The first one is an 
interactive tool that helps select correct software options with respect to the dependencies between
the modules, and is based on propositional logic. The second part builds the source code following the 
set of selected options. 

The paper is organized as follows. Section 2 explains the concepts and goals of CONFIGEN. Section 3
provides a set of good practice rules with concrete examples on how to use CONFIGEN, as well as
our experience using it with the OASIS kernel [9]. Section 4 presents our current prototype, and the
theoretical problems of its core component, the logic solver. Section 5 presents related work, and
section 6 concludes.

2 The CONFIGEN approach 

2.1 Configuration options 

CONFIGEN operates on configuration options, rather than on modules. A module is a part of the code
which, when associated with its dependencies, is "self-contained", and often has a defined meaning that
depends on the language (e.g. Java classes, ML modules, C functions and files). Configuration options
represent arbitrary pieces of code, which encompass the notion of modules, and are thus more general:
they can span from several lines of code inside a function to a large set of modules.

Formally, a configuration option is a couple $(v, s_v)$ of a boolean variable $v$ and a code selector $s_v$. $v$ is
true when the functionality is present, and false otherwise. The code selector $s_v$ is a function that, given a
value of $v$ and a code $c$ (a sequence of characters), returns a subsequence of $c$. A concrete implementation
of this function is the use of the C preprocessor to eliminate conditional code (see section 2.3.2). Another
one is the selection in a Makefile of a subset of the files to compile or link (see section 2.3.3).

CONFIGEN operates on closed systems, i.e. all the code and configuration options are assumed to be 
known when the system is built. This is a requirement of the "static configuration" approach. 

Once the values for all configuration options $v$ have been chosen, the configured code can be obtained
by applying all the $s_v$ to the original code.

2.2 Relations between configuration options 

2.2.1 Basic operators 

Two operators are defined to describe all dependency relations between configuration options in the 
system: 

• The dependency operator, $a \rightarrow b$, means that the configuration option $a$ can only be present if $b$ is
present. It is the standard boolean implication operator (also written $\Rightarrow$).

• The interface/implementations operator, written $a : i_1 | i_2 | \cdots | i_n$, means two things:

1. if the interface $a$ is false, then all of the implementations $i_1, \ldots, i_n$ are false;

2. if $a$ is true, exactly one of $i_1, i_2, \ldots, i_n$ is true.

The interface/implementations operator can be expressed by the following logical formula:

$$\left(\neg a \Rightarrow \bigwedge_{1 \le k \le n} \neg i_k\right) \wedge \left(a \Rightarrow \bigvee_{1 \le k \le n} \Bigl(i_k \wedge \bigwedge_{l \ne k} \neg i_l\Bigr)\right)$$

In fact, only this last operator is formally needed, because it is functionally complete.¹ But the use
of the dependency operator makes things simpler for both the user and the logic simplifier.
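As a sanity check, the two operators can be written as ordinary boolean functions and compared with the formula above by enumerating truth tables. The following Python sketch is only illustrative: the names `dep`, `iface`, `iface_formula` and `check` are ours and are not part of CONFIGEN.

```python
from itertools import product

def dep(a, b):
    """Dependency operator: a can only be present if b is present."""
    return (not a) or b

def iface(a, impls):
    """Interface/implementations operator a : i_1 | ... | i_n (informal definition)."""
    if not a:
        return not any(impls)      # interface absent: no implementation allowed
    return sum(impls) == 1         # interface present: exactly one implementation

def iface_formula(a, impls):
    """Same operator, written as the logical formula of this section."""
    n = len(impls)
    all_false = all(not i for i in impls)
    exactly_one = any(
        impls[k] and all(not impls[l] for l in range(n) if l != k)
        for k in range(n)
    )
    # (not a => all_false) and (a => exactly_one)
    return (a or all_false) and ((not a) or exactly_one)

def check():
    """Compare both definitions on every valuation of a and three implementations."""
    for a, i1, i2, i3 in product([False, True], repeat=4):
        if iface(a, [i1, i2, i3]) != iface_formula(a, [i1, i2, i3]):
            return False
    # Identities used to argue functional completeness.
    for x, y in product([False, True], repeat=2):
        if iface(x, [x, x]) != (not x):                      # negation
            return False
        if iface(not x, [x, x]):                             # the constant 0
            return False
        if iface(False, [not x, not y]) != (x and y):        # conjunction
            return False
    return True
```

Calling `check()` exercises both the formula and the three identities from which negation, the constant 0 and conjunction are obtained.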

In our system the complete relationship between the configuration options can be written as a
conjunction of formulas which either use the interface/implementations operator on several literals, or the
dependency operator on two literals. For convenience, we also allow the use of the $\wedge$ operator on the
right side of a $\Rightarrow$ operator. Other operators may be added in the future, but as of now, we do not believe
that $\neg$ or $\vee$ are useful operators. We believe that restricting the number of operators is simpler for the
developer and encourages good software practices, as described in section 3.

2.2.2 Textual and graphical representation 

One of the main benefits of using only these two operators is that they allow simple textual and graphical
interfaces. In CONFIGEN, the user specifies the dependencies in a special "deps" file, whose core BNF²
is simple:

¹ Proof: $\neg x$ is $(x : x|x)$, $0$ is $(\neg x : x|x)$, and $x \wedge y$ is $(0 : (\neg x|\neg y))$.

² The complete BNF allows some extensions, as seen in sections 2.3.3 and 3.2.
<deps> ::= { <dep_line> | <iface_line> }*
<dep_line> ::= <id> "->" <id> { "&" <id> }* "\n"
<iface_line> ::= <id> ":" <id> { "|" <id> }* "\n"
<id> ::= [a-z][a-z0-9_]*

Each id represents a configuration option (i.e. a boolean variable), and each line expresses either a
dependency relation or an interface/implementations relation. The whole program relation is the conjunction
of the relations in each line. Figure 1(a) presents an example of this textual representation.
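Since the core grammar is regular, a deps file can be read with two regular expressions. The following Python sketch illustrates the idea under our own assumptions: the tuple representation `(kind, lhs, rhs)`, the function name `parse_deps` and the decision to skip blank lines are hypothetical, and the extensions mentioned in the footnote (properties, the `?` suffix) are not handled.

```python
import re

ID = r"[a-z][a-z0-9_]*"
# <id> "->" <id> { "&" <id> }*
DEP_RE = re.compile(rf"^({ID})\s*->\s*({ID}(?:\s*&\s*{ID})*)$")
# <id> ":" <id> { "|" <id> }*
IFACE_RE = re.compile(rf"^({ID})\s*:\s*({ID}(?:\s*\|\s*{ID})*)$")

def parse_deps(text):
    """Return a list of ("dep" | "iface", left-hand id, [right-hand ids])."""
    relations = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        m = DEP_RE.match(line)
        if m:
            targets = [t.strip() for t in m.group(2).split("&")]
            relations.append(("dep", m.group(1), targets))
            continue
        m = IFACE_RE.match(line)
        if m:
            impls = [i.strip() for i in m.group(2).split("|")]
            relations.append(("iface", m.group(1), impls))
            continue
        raise ValueError(f"line {lineno}: cannot parse {raw!r}")
    return relations
```

The whole-program relation is then the conjunction of the returned relations, one per line.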



The relations between configuration options using our operators also admit a simple graphical
representation. This representation is a graph where nodes represent configuration options, and arrows
represent dependency relations. An interface is a box that encloses its implementations.



Figure 1(b) gives an example of how the scheduler part of our kernel can be modularized. The
microkernel can run on three different embedded platforms, with ARM, PowerPC, or S12XE processors.
ARM and PowerPC both have LL/SC (load-linked/store-conditional) instructions, while S12XE and PowerPC
provide hardware spinlocks. Note that spinlocks can also be implemented using LL/SC. Finally, the
scheduler depends on two subsystems, which handle the current clock and a list of contexts, and for each of which two
versions exist: one that uses spinlocks, and one that uses LL/SC instructions.

Colors represent the valuation of boolean variables, as described in the following section. The interactive
solver is described in detail in section 4.2.

2.3 Tools and integration with the development environment 

2.3.1 The configuration selector 

A configuration is the assignment of a truth value to all configuration options. The most important 
requirement for a configuration is to be correct, i.e. that all the relations between configuration options 
are satisfied. It is fairly easy to write a program that checks that a given configuration is correct. 

But writing a configuration by hand would be tedious for the user, all the more so because our
method encourages using many configuration options (see section 3.4). Moreover, most options can be
constrained automatically.



This explains why the configuration selector is necessary. Figure 1(b) is a part of a screenshot of
our tool. Its interface is simple: clicking on a node switches the valuation of the corresponding option
between "enforce true" (dark green), "enforce false" (dark red), and "unenforced". Unenforced options
can be in different states: "implied true" (light green), meaning that all correct configurations require the
option to be true; "implied false" (light red), meaning that all correct configurations require the option to
be false; and "normal" (gray), meaning that there exist correct configurations where the option is
true and others where the option is false. The tool warns the user when the enforced values are impossible
to satisfy, and allows saving the configuration when every option has been assigned a value.

In the example, the user has explicitly expressed that he wants the sched and arm configuration 
options to be true (in dark green). Had the S12XE platform been selected instead of the ARM one, the 
configuration would be complete, i.e. every configuration option would have been inferred to be either 
true or false. 
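The three states of an unenforced option can be reproduced by brute force on this example. The Python sketch below is not the CONFIGEN solver (see section 4.2): it simply enumerates every configuration, which is only practical for small deps files. The relation tuples, the names `holds` and `classify`, and the identifiers of Figure 1(a) as we read them (`llsc`, `s12xe`, and so on) are our assumptions.

```python
from itertools import product

# Figure 1(a) as (kind, left-hand side, right-hand side) triples.
RELATIONS = [
    ("dep", "sched", ["clock", "ctxlist"]),
    ("iface", "clock", ["clock_llsc", "clock_spinlock"]),
    ("dep", "clock_spinlock", ["spinlock"]),
    ("dep", "clock_llsc", ["llsc"]),
    ("iface", "ctxlist", ["ctxlist_llsc", "ctxlist_spinlock"]),
    ("dep", "ctxlist_llsc", ["llsc"]),
    ("dep", "ctxlist_spinlock", ["spinlock"]),
    ("iface", "spinlock", ["spinlock_ppc", "spinlock_s12xe", "spinlock_llsc"]),
    ("dep", "spinlock_llsc", ["llsc"]),
    ("iface", "llsc", ["llsc_arm", "llsc_ppc"]),
    ("dep", "llsc_arm", ["arm"]),
    ("dep", "llsc_ppc", ["powerpc"]),
    ("dep", "spinlock_s12xe", ["s12xe"]),
    ("dep", "spinlock_ppc", ["powerpc"]),
    ("iface", "plateform", ["powerpc", "s12xe", "arm"]),
]

def holds(relation, val):
    """True if one relation is satisfied by the valuation val (name -> bool)."""
    kind, lhs, rhs = relation
    if kind == "dep":
        return (not val[lhs]) or all(val[r] for r in rhs)
    chosen = sum(val[r] for r in rhs)
    return chosen == 1 if val[lhs] else chosen == 0

def classify(relations, enforced):
    """Map each unenforced option to its state, or return None if unsatisfiable."""
    names = sorted({n for _, lhs, rhs in relations for n in [lhs, *rhs]})
    free = [n for n in names if n not in enforced]
    seen = {n: set() for n in free}
    found = False
    for values in product([False, True], repeat=len(free)):
        val = {**enforced, **dict(zip(free, values))}
        if all(holds(r, val) for r in relations):
            found = True
            for n in free:
                seen[n].add(val[n])
    if not found:
        return None
    labels = {(True,): "implied true", (False,): "implied false"}
    return {n: labels.get(tuple(seen[n]), "normal") for n in free}
```

With `classify(RELATIONS, {"sched": True, "arm": True})`, options such as `clock_llsc` remain "normal", whereas enforcing `s12xe` instead of `arm` leaves no "normal" option, in line with the discussion above.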

We found the use of this tool very intuitive, and that creating a new configuration was fast. 

2.3.2 Generation of a config.h file

One of the main uses of our tool is the generation of a config.h file for use by the C preprocessor. This
is the concrete implementation of the abstract code selector presented in section 2.1.
sched -> clock & ctxlist
clock: clock_llsc | clock_spinlock
clock_spinlock -> spinlock
clock_llsc -> llsc
ctxlist: ctxlist_llsc | ctxlist_spinlock
ctxlist_llsc -> llsc
ctxlist_spinlock -> spinlock
spinlock: spinlock_ppc | spinlock_s12xe | spinlock_llsc
spinlock_llsc -> llsc
llsc: llsc_arm | llsc_ppc
llsc_arm -> arm
llsc_ppc -> powerpc
spinlock_s12xe -> s12xe
spinlock_ppc -> powerpc
plateform: powerpc | s12xe | arm



(a) Example of textual representation. 



(b) Graphical representation of 1(a). [Graph image; visible node labels include ctxlist, llsc, clock, clock_spinlock, spinlock and spinlock_s12xe.]



Figure 1 : An example of textual and graphical representation. Each node represents a configuration 
option. Rectangular nodes represent interfaces, and the nodes they encompass are their implementations. 
Arrows represent dependencies. Colors represent a partial resolution of the logic problem: dark green 
nodes have been enforced to be true, light green ones are deduced true, and light red ones are deduced 
false. 
The config.h file is generated once all the configuration options are assigned a value. For every
configuration option set to "true", a line #define CONFIG_<config option name> is inserted in this
file.

Every file in the project contains a #include <config.h>, and code can be made optional using
#ifdef CONFIG_<config option name> or #ifndef CONFIG_<config option name>.
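The generation step itself amounts to a few lines of scripting. The following Python sketch shows one possible shape for it; the function name `render_config_h`, the banner comment and the upper-casing of option names (suggested by symbols such as CONFIG_SMP) are our assumptions, not a description of the actual Ruby script.

```python
def render_config_h(configuration):
    """Render config.h from a complete configuration (option name -> bool)."""
    lines = ["/* Generated file, do not edit. */"]  # hypothetical banner
    for name in sorted(configuration):
        if configuration[name]:
            # One #define per option set to true; false options produce nothing.
            lines.append(f"#define CONFIG_{name.upper()}")
    return "\n".join(lines) + "\n"
```

Options set to false simply produce no line, so that `#ifdef` and `#ifndef` tests in the C code select the corresponding code.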

Another advantage of using CONFIGEN is the assurance that configuration options are defined
consistently. In particular, this avoids the problem where a conditional is defined only if another conditional
is activated. For instance, spinlocks are useful only on multiprocessor systems, which can lead to code
like this:



#ifdef CONFIG_SMP
#define CONFIG_SPINLOCK
#endif

#if defined(CONFIG_SMP) && !defined(CONFIG_SPINLOCK)
// conditional code
#endif





Thus the user always has to remember to test the CONFIG_SMP conditional before testing
CONFIG_SPINLOCK, which is tedious and error-prone. The use of a single, consistent config.h avoids
any need for nested preprocessor conditionals.

With all these problems solved, the use of conditionals in C code becomes much more readable and 
maintainable, and allows for reusable code without sacrificing performance. 

2.3.3 Generation of Makefiles

Experience shows that selecting code parts using only preprocessor conditionals leads to unreadable 
code. Often, a better way is to perform a selection of the files to be compiled in the build scripts, such as 
Makefiles. 

One way to achieve that would be to use conditionals in the Makefile, but this makes it harder to 
read and more error-prone. A better way is to generate the list of objects and other targets to be built. 
To do that, we have extended CONFIGEN to handle properties, which are information attached to the 
configuration options. Properties are expressed in the dependency file, as in the following example: 

ctxlist.objs = ctxlist_common.o
ctxlist_spinlock.objs = ctxlist_spinlock.control.o ctxlist_spinlock.exec.o
microkernel.targets = microkernel

These properties are used to generate a file config.mk:

all_objs = ctxlist_common.o ctxlist_spinlock.control.o ctxlist_spinlock.exec.o
all_targets = microkernel

This file is included in the main Makefile for the application: 

all: $(all_targets)
microkernel: $(all_objs)

# Special rules possibly needed to build object files
%.control.o: %.c



CONFIGEN can easily be extended to handle new properties, or build tools other than make.
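The way properties are merged into config.mk can be sketched as follows. This Python fragment is an illustration under our own assumptions: the function name `render_config_mk` and the representation of properties as a dictionary keyed by `(option, property)` are hypothetical. Only the properties of options set to true contribute to the output.

```python
def render_config_mk(configuration, properties):
    """Merge the properties of selected options into all_<property> variables.

    configuration: option name -> bool
    properties: (option name, property name) -> list of words,
                e.g. ("ctxlist", "objs") -> ["ctxlist_common.o"]
    """
    merged = {}
    for (option, prop), words in properties.items():
        if configuration.get(option, False):
            merged.setdefault(prop, []).extend(words)
    return "".join(
        f"all_{prop} = {' '.join(words)}\n" for prop, words in sorted(merged.items())
    )
```

Applied to the three properties shown above with all three options selected, this yields the two lines of config.mk given in this section.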
3 Usage patterns and good practice

Proper use of our tools requires complying with a set of good practices that help write more modular
and understandable code by using the right amount of configuration options. Indeed, while creating redundant
options should be avoided, it is also a bad idea to group independent concepts into a single configuration
option. The following presents common use cases and the best way to describe them using CONFIGEN.

3.1 Using configuration options for modular construction 

Decomposition into interface and implementations is a common practice (see the ML module system, or
C++/Java abstract and concrete classes). An interface helps in understanding the specification of a module
(and how to use it) without needing to understand its implementation.

In C, defining a function is almost the only way of creating abstraction, and a function is not enough
to define a module. However, it is possible to write modular software in C by grouping together several
function definitions in one or several files, and grouping all the function declarations in one header file
that defines the interface.

CONFIGEN then helps to make these modules optional, to manage different implementations of the
same module, to state dependencies between modules, and to automate their build. Moreover, module
dependencies are important information on how the software is built and how its modules interact,
and CONFIGEN's graphical output is very useful as documentation. This helps in making the source code
self-documenting, an important principle for understandable code (especially in open source software).

3.2 Optional behavior in small pieces of code 

There are some configuration options that affect small pieces of code, typically something too small to 
write a specific module. For instance, our scheduler has an optional optimization that requires a small 
calculation in order to avoid sending an inter-processor interrupt. The C code looks like: 

#ifdef CONFIG_OPTIMIZE_SEND_IPI 
if( do_calculation()) return; 
#endif 
send_IPI() ; 

The approach we advocate using CONFIGEN is to define optimize_send_ipi as an interface with
two implementations (optimize_send_ipi_yes and optimize_send_ipi_no), and make sched
depend on it. The symbol CONFIG_OPTIMIZE_SEND_IPI will then be either defined or undefined,
depending on the chosen implementation. For convenience, the "yes/no" implementations are automatically
declared in the deps file when a symbol name ends with a question mark. The final deps file is then:

sched -> optimize_send_ipi?
# (auto) optimize_send_ipi? : optimize_send_ipi_yes | optimize_send_ipi_no

This kind of optional behavior is not restricted to yes/no choices, and this scheme accommodates
any number of options.

3.3 Optional use of a module 

Sometimes the use of a module can be optional. For instance, when porting OASIS to a new platform,
we do not implement memory protection in the early stages of development, in order to quickly get to
a functional kernel. The recipe in the previous section can be used in this case; basically, it consists of
surrounding all uses of a module with #ifdef CONFIG_USE_... directives.

This raises a few problems though. First, it leads to many uses of preprocessor conditionals in the
code, which makes it less readable. Second, if the module is used in different places, all of these places
are impacted by the conditional use of the module.

It is better to create, for this module M, a new implementation M_empty, in which all the functions do
nothing (or are replaced by macros that do nothing).³ This leads to fewer configuration options, less code,
and code that is easier to read.

3.4 Dividing configuration options 

Options should be split into the smallest possible pieces. One could think that too much splitting of 
options would lead to a proliferation of configuration options, and would make options dependencies 
difficult to understand. 

On the contrary, having many options and modules makes their meaning more precise. Each config- 
uration option names a concept of the application, and giving names to precise concepts helps greatly in 
their understanding. Moreover, it makes the system more modular. 

As an example, the OASIS micro-kernel defines a date configuration option that accepts three
different values: date16, date32, and date64, which set the size of a date_t integer type. The original
code was written with the assumption that a 16-bit or 32-bit date_t may lead to a date overflow, whereas a
64-bit field may not; therefore all the overflow-handling code was enclosed by #if defined(DATE16)
|| defined(DATE32) directives, which does not seem appropriate at first glance.

We reworked this using CONFIGEN, and here is the result: 

date -> date_size & date_overflows?
# (auto) date_overflows? : date_overflows_yes | date_overflows_no
date_size: date16 | date32 | date64
date16 -> date_overflows_yes
date32 -> date_overflows_yes
date64 -> date_overflows_no

Even if we added new configuration options, the resulting code is easier to understand, as the "over- 
flow" concept is named and assumptions are explicit. It is also more modular, as we could easily change 
the code to allow 32 bit dates that do not overflow. 



3.5 Testing code 

When writing a unit test for a module M, some code has to be activated to test M (e.g. calls to M and
checks of M's results), and some code has to be activated to satisfy M's dependencies (for instance,
unit testing of our scheduler requires a special version of the context-switching functions that only logs
context-switch operations).

So far we have found CONFIGEN to be of great help in automating the activation of these requirements.
However, there is still some work to do to improve this automation.

³ Note that this is much easier to achieve if the interface only exposes functions, and not global variables; this is one of the
reasons why it is preferable to hide global variables with static and use accessor/mutator functions.
4 The CONFIGEN prototype

4.1 The CONFIGEN script 

Aside from the logic solver, CONFIGEN is an extremely simple tool, composed of fewer than 400 lines of
Ruby code. Yet this tool parses the deps file, implements the graphical user interface, interacts with the
solver, and outputs the config.h and config.mk files.

The parser, which builds the dependency graph from the deps file, was really easy to write because our
syntax defines a regular language, and thus can be parsed easily using simple regular expressions.

To avoid the tedious development of a complex HMI, we use a graphviz⁴ feature that can output
images and HTML image maps such that clicking on a node sends a different HTTP request. So all
CONFIGEN has to do is to output the graph in the dot file format with the correct options, implement a
small web server to handle the different "node clicked" requests, communicate with the logic solver, and
ask graphviz to do all the redisplay work with the result of the solver. This way, a standard web browser
serves as the graphical interface. Printing the config.h and config.mk files was just trivial scripting.
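To make the graphviz step concrete, the following Python sketch emits a dot description in which each option carries a `URL` attribute, so that graphviz can produce a clickable image map. It is a simplified illustration and not the actual Ruby code: the function name `to_dot`, the `/click?option=` URL scheme and the color names are hypothetical, and an implementation that is itself an interface (nested boxes) is not handled.

```python
def to_dot(relations, colors=None):
    """Render (kind, lhs, rhs) relations as a Graphviz dot document."""
    colors = colors or {}
    impls = {lhs: rhs for kind, lhs, rhs in relations if kind == "iface"}
    members = {m for rhs in impls.values() for m in rhs}
    names = {n for _, lhs, rhs in relations for n in [lhs, *rhs]}

    def attrs(name):
        fill = colors.get(name, "gray")
        # Hypothetical URL scheme understood by the small web server.
        return f'style=filled, fillcolor="{fill}", URL="/click?option={name}"'

    def endpoint(name):
        # Edges join nodes; an interface is reached through its first implementation.
        return impls[name][0] if name in impls else name

    out = ["digraph deps {", "  compound=true;"]
    for name in sorted(names - members - set(impls)):
        out.append(f'  "{name}" [{attrs(name)}];')
    for iface, rhs in impls.items():
        # An interface is drawn as a box (cluster) enclosing its implementations.
        out.append(f'  subgraph "cluster_{iface}" {{')
        out.append(f'    graph [label="{iface}", {attrs(iface)}];')
        for m in rhs:
            out.append(f'    "{m}" [{attrs(m)}];')
        out.append("  }")
    for kind, lhs, rhs in relations:
        if kind != "dep":
            continue
        for target in rhs:
            extra = []
            if lhs in impls:
                extra.append(f'ltail="cluster_{lhs}"')
            if target in impls:
                extra.append(f'lhead="cluster_{target}"')
            out.append(f'  "{endpoint(lhs)}" -> "{endpoint(target)}" [{", ".join(extra)}];')
    out.append("}")
    return "\n".join(out) + "\n"
```

The resulting text can be passed to `dot -Tpng` for the image and to `dot -Tcmapx` for the matching HTML image map.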

4.2 The logic solver 

Every time the user clicks on a node, the corresponding configuration symbol (i.e. logical literal) is set
to a truth value, cycling through TRUE (1), then FALSE (0), then back to the "unset" state. The idea is, after
each click, to infer which configuration symbols have to be TRUE or FALSE as a consequence of the user's
action.

4.2.1 Formal definition of the problem 

Let us define the following notations: 

• $X = \{x_1, \ldots, x_n\}$ is the set of literals defined in the deps file, and $f(x_1, \ldots, x_n)$ the boolean
expression corresponding to the dependency graph.

• $\mathcal{A}$ is a boolean clause defining the partial truth assignment, as defined by the clicks of the user.
We note $U_1 \subset X$ (resp. $U_0 \subset X$) the set of literals forced to 1 (resp. to 0) by the user. For the rest
of this section, we assume without loss of generality that the literals are ordered as follows:

$$\underbrace{x_1, \ldots, x_m}_{=U_0},\ \underbrace{x_{m+1}, \ldots, x_p}_{=U_1},\ x_{p+1}, \ldots, x_n \quad \text{with } m \le p \le n$$

Therefore we have:

$$\mathcal{A} = \Bigl(\bigwedge_{i \in \{1,\ldots,m\}} \neg x_i\Bigr) \wedge \Bigl(\bigwedge_{j \in \{m+1,\ldots,p\}} x_j\Bigr)$$

• Let $f_{\mathcal{A}} = f \wedge \mathcal{A}$, i.e. $f_{\mathcal{A}}$ is the function obtained after setting in the expression of $f$ all the literals
in $U_1$ and $U_0$.

Then our problem is to find the biggest subsets $S_0$ and $S_1$ of $\{x_{p+1}, \ldots, x_n\}$ such that $S_0 \cap S_1 = \emptyset$ and:

$$\forall x \in S_0,\ (f_{\mathcal{A}} \Rightarrow \neg x) \text{ is a tautology}$$

$$\forall x \in S_1,\ (f_{\mathcal{A}} \Rightarrow x) \text{ is a tautology}$$

Theorem 1. The problem of finding whether a given literal is in $S_1$ (resp. $S_0$) is co-NP-complete.



⁴ An open-source graph visualization software: http://www.graphviz.org
Before proving this theorem, let us show the following result:

Lemma 1. The satisfiability problem (P) of a boolean expression $f(x_1, \ldots, x_n)$ described by the deps
file (whose operators are described in section 2.2.1) is NP-complete.



Proof of Lemma 1. Let us first prove that (P) is an NP-hard problem. For that purpose, we can easily
reduce any instance of 3-SAT, a well-known NP-complete problem, to a formula such as $f$. Indeed, each
clause $(y_1 \vee y_2 \vee y_3)$ of a 3-CNF expression can be written using the interface/implementations operator
as $\neg(0 : (y_1|y_2|y_3))$ (see the footnote in section 2.2.1 for the expression of the "$\neg$" operator and the "0" Boolean
constant with our operators). As this transformation can clearly be performed in polynomial time, (P) is
NP-hard.

But (P) is also in NP. Indeed, an algorithm that non-deterministically chooses the Boolean value of
each literal $(x_1, \ldots, x_n)$ can easily decide in polynomial time whether the formula $f(x_1, \ldots, x_n)$ is true for the
chosen valuation.

Since (P) is both in NP and NP-hard, it is NP-complete. □

Proof of Theorem 1. We will prove this theorem for literals in $S_1$; the case of literals belonging to $S_0$ is
almost identical.

Given a partial truth assignment $\mathcal{A} = \bigl(\bigwedge_{i \in \{1,\ldots,m\}} \neg x_i\bigr) \wedge \bigl(\bigwedge_{j \in \{m+1,\ldots,p\}} x_j\bigr)$ and an unvalued
literal $x_k$ (with $k \in \{p+1, \ldots, n\}$), let us note (P') the problem of deciding whether

$$\underbrace{f(x_1, \ldots, x_n) \wedge \mathcal{A}}_{f_{\mathcal{A}}} \wedge\ \neg x_k$$

is satisfiable. Note that $(f_{\mathcal{A}} \Rightarrow x_k)$ is a tautology iff $f_{\mathcal{A}} \wedge \neg x_k$ is not satisfiable. Therefore, proving that the
complement problem (P') is NP-complete will prove the theorem.

We can reduce in polynomial time any problem (P) to a (P') problem by extending the set of literals
addressed by (P). Let $y \notin X$ and $z \notin X$ be any two literals; then the following formula is an instance of
(P') with $n + 2$ variables:

$$\underbrace{f(x_1, \ldots, x_n) \wedge (y \Rightarrow y) \wedge (z \Rightarrow z)}_{F(y, z, x_1, \ldots, x_n)} \wedge\ y \wedge \neg z$$

With the previous notations, we have in this case $U_0 = \emptyset$, $U_1 = \{y\}$, $\mathcal{A} = y$. If this formula is satisfiable,
then so is $f$; conversely, if $f$ is satisfiable, the above formula is also satisfiable (with $y$ set to "true" and
$z$ set to "false"). (P') is thus an NP-hard problem. For the same reasons as for (P) (see Lemma 1), it is
also in NP, and is therefore NP-complete.

Therefore, deciding if a literal belongs to $S_1$ is indeed a co-NP-complete problem. □

We have proved here that in the most general case, the problem addressed is co-NP-complete.
However, deciding if $f_{\mathcal{A}} \Rightarrow x_k$ is a tautology is meaningful only if $f_{\mathcal{A}}$ is itself satisfiable, i.e. if the logical
description of the system is "coherent", and if the options chosen so far by the user are not contradictory.
Therefore, another approach would be to ensure this property first with a regular SAT-solver, then to
check whether $f_{\mathcal{A}} \wedge \neg x_k$ is satisfiable or not. This last step is probably easier than a co-NP problem. Moreover,
once $f$ has been proved to be satisfiable, it should be easier to prove that $f_{\mathcal{A}}$ is satisfiable every time the
user iteratively appends new literals to $\mathcal{A}$ by clicking.

4.2.2 Internals of the solver 

Our logic solver applies a simple and intuitive heuristic to compute subsets of $S_0$ and $S_1$. The idea is to
compute and simplify the $f_{\mathcal{A}}$ expression for each assignment $\mathcal{A}$ provided by the user, then to convert
it to CNF using a straightforward algorithm.⁵ Then, all clauses of the final expression that are
literals (resp. negated literals) belong to $S_1$ (resp. $S_0$).

To this end, the dependency graph expressed in deps is translated into a boolean expression that
only uses $\wedge$, $\vee$, and literal $\neg$.⁶ The formal simplifier can then manipulate this boolean expression and
apply the following basic logic rules:

$$x \wedge 1 = x \qquad x \wedge 0 = 0 \qquad x \wedge f(x, \ldots) = x \wedge f(1, \ldots) \qquad \neg x \wedge f(x, \ldots) = \neg x \wedge f(0, \ldots)$$

$$x \vee 1 = 1 \qquad x \vee 0 = x \qquad x \vee f(x, \ldots) = x \vee f(0, \ldots) \qquad \neg x \vee f(x, \ldots) = \neg x \vee f(1, \ldots)$$

before converting the result to CNF. The literal and negated-literal clauses are then extracted and sent
back to the Ruby script for display.
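As a small worked illustration of these rules (our own example, built from the relation sched -> clock & ctxlist written as two clauses), suppose the user enforces sched. The simplification can then proceed as follows:

```latex
\begin{align*}
& \mathit{sched} \wedge (\neg \mathit{sched} \vee \mathit{clock}) \wedge (\neg \mathit{sched} \vee \mathit{ctxlist}) \\
={}& \mathit{sched} \wedge (\neg 1 \vee \mathit{clock}) \wedge (\neg 1 \vee \mathit{ctxlist})
   && \text{by } x \wedge f(x, \ldots) = x \wedge f(1, \ldots) \\
={}& \mathit{sched} \wedge (0 \vee \mathit{clock}) \wedge (0 \vee \mathit{ctxlist})
   && \text{since } \neg 1 = 0 \\
={}& \mathit{sched} \wedge \mathit{clock} \wedge \mathit{ctxlist}
   && \text{by } x \vee 0 = x
\end{align*}
```

The result is already in CNF, and its clauses clock and ctxlist are single literals, so both options would be reported as deduced true.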

In most cases, our solver managed to find the whole subsets $S_0$ and $S_1$. In some cases, however, it failed to
see all dependencies. For instance, in the example of Figure 1(b), our solver was actually unable to
deduce from $\{arm = 1, sched = 1\}$ that llsc (and subsequently llsc_arm) is always true.⁷ The reason
for this failure is that the policy described above is not sufficient to deduce from:

$$(a \oplus b) \wedge (a \Rightarrow c) \wedge (b \Rightarrow c) = (a \vee b) \wedge (\neg b \vee \neg a) \wedge (\neg a \vee c) \wedge (\neg b \vee c)$$

that $c$ is necessarily always true.
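This gap can be checked mechanically on the formula above. In the Python sketch below (the names `CNF`, `unit_literals` and `entailed` are ours), extracting single-literal clauses from the CNF finds nothing, whereas an exhaustive truth-table check confirms that c holds in every model. The truth-table check stands in for a complete solver and is not how CONFIGEN works.

```python
from itertools import product

# CNF of (a xor b) and (a => c) and (b => c); a literal is (name, polarity).
CNF = [
    [("a", True), ("b", True)],
    [("b", False), ("a", False)],
    [("a", False), ("c", True)],
    [("b", False), ("c", True)],
]

def unit_literals(cnf):
    """Literals found by the heuristic: clauses reduced to a single literal."""
    return [clause[0] for clause in cnf if len(clause) == 1]

def entailed(cnf, name, polarity):
    """True if every model of cnf gives `name` the value `polarity`.

    Note: an unsatisfiable cnf entails everything (vacuously).
    """
    names = sorted({n for clause in cnf for n, _ in clause})
    for values in product([False, True], repeat=len(names)):
        val = dict(zip(names, values))
        is_model = all(any(val[n] == p for n, p in clause) for clause in cnf)
        if is_model and val[name] != polarity:
            return False
    return True
```

Here `unit_literals(CNF)` is empty while `entailed(CNF, "c", True)` holds, which is exactly the situation described above: the heuristic is correct but incomplete.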

Even if the solver shows its limitations, it remains correct in the sense that it will never deduce an 
erroneous literal value, e.g. that would define unwanted options, or that would result in an unsatisfiable 
expression. 

The solver is written in C; its performance was not measured precisely, but for all the dependency
trees that we have used so far to model the OASIS kernel, the computation time appeared instantaneous. It has not
yet been tested on larger-scale projects, mainly because it is still an "ad-hoc" tool that requires many
improvements before being subject to a relevant performance evaluation.

Evolutions will be discussed in section 6. Although the solver approach is obviously not suitable
for (even approximate) solving of SAT problems (especially when compared to dedicated tools such as
MiniSat [6]), we believe we can make it more efficient by focusing only on a meaningful restricted set
of boolean expressions, e.g. only those represented by an acyclic graph.

5 Related works 

Using boolean logic to manage and validate complex dependencies schemes has been done before in 
different application domains, including software architecture. 

Our development approach is close to the Software Product Line engineering technique fSl, as it 
promotes modularity and re-usability as development-driving key concepts. The features of a SPL are 
usually represented as an oriented graph (feature diagram). The semantics of this graph has been for- 
malized and studied Q, although to the best of our knowledge it is not used in any practical application. 

The Kconfig Linux kernel configurator is a tool similar to ours. With the use of Kconfig script files 
and a dedicated syntax, the kernel developers have a powerful and efficient way to express internal de- 
pendencies. The user has therefore a great freedom in the choice of his kernel components (see Sincero's 
work inn - an attempt to bridge the gap between the SPL and the Open Source communities, and the 
Linux kernel documentation⁸). However, no graph representation of the dependencies is provided, nor



⁵ The algorithm consists in recursively applying the distributivity property of the $\wedge$ operator.

⁶ By literal $\neg$ we mean that the operator may only be applied to boolean variables, not to operators. E.g. $\neg(a \wedge b)$ is
prohibited, whereas $(\neg a) \vee (\neg b)$ is not.

⁷ Indeed, the mandatory choice of one option in clock as well as in ctxlist will either set llsc directly, or will set
spinlock, then spinlock_llsc which is the last available choice, then llsc.

⁸ http://www.kernel.org, see the Documentation/kbuild/ directory.
any interactive interface such as ours.

The main link between software component dependencies and propositional logic comes from the
work of Mancinelli, Abate, Boender and Di Cosmo on Free Open-Source Software distributions, through
the EDOS and Mancoosi projects [12]. In [10], a SAT-solver is used to address the installability
problem of a set of packages; in [1], the dependency graph of a package repository is analyzed to identify
"sensitive" components that may widely impact the system if corrupted or removed.

6 Conclusion 

This paper has presented CONFIGEN, a tool for managing software configuration options. We explained
the concepts behind CONFIGEN, showed how it can be integrated in a software development project,
and described a set of good practices and examples that come from our experience using CONFIGEN for
system development. We also presented the graphical interface of CONFIGEN, the associated logic solver
and the theoretical problem it addresses.

CONFIGEN is still a prototype, but has already proved to be very useful. The tools are simple to use
and have helped in refactoring complex software, making it easier to understand. CONFIGEN also encourages
good software practices. We found the graphical interface of great help when defining new modules.

There are many possible future developments. It might be interesting to extract the dependency file
from the source code, using source code annotations for instance. Another interesting point would be
to guide the user's choices with "automatic" implementations that would discard rarely used
options by default (e.g. benchmarking modules). Finally, the work of [1] could also be used to isolate critical
features, with application to quality assurance.

Many interesting developments also remain to be done in the solver. The problem we need to solve
is co-NP-complete in the general case; however, we have not yet taken many restrictions into account.
For instance, the current proof uses interface/implementations operators in which some implementations
belong to multiple interfaces, which does not happen in real use. Moreover, we do not use cycles in
the use cases encountered so far. It is possible that with such restrictions, the problem we try to solve
becomes polynomial. Even if it does not, there are strong relationships between successive iterations of
the problem to solve (i.e. they differ by only one truth assignment), which could be exploited by an
incremental solver. Meanwhile, it would be more appropriate to connect to a SAT-solver and get the
complete solution, as suggested in section 4.2. Such solvers could also be used to get the set of all the possible
configurations, e.g. for testing purposes.